Principal Methods

ABC-Net [147] is another network designed to improve the performance of binary networks. ABC-Net approximates the full-precision weight filter W with a linear combination of M binary filters B1, B2, ..., BM ∈ {−1, +1} such that W ≈ α1B1 + ··· + αMBM. These binary filters are fixed as follows:

$$B_i = F_{u_i}(W) := \operatorname{sign}\big(\bar{W} + u_i\,\mathrm{std}(W)\big), \quad i = 1, 2, \ldots, M, \tag{1.11}$$

where W̄ is the mean-centered W (W with its mean subtracted) and std(W) is the standard deviation of W. For activations, ABC-Net employs multiple binary activations to alleviate information loss. Like the binarized weights, the real activation I is estimated using a linear combination of N binary activations A1, A2, ..., AN such that I = β1A1 + ··· + βNAN, where

$$A_1, A_2, \ldots, A_N = H_{v_1}(R), H_{v_2}(R), \ldots, H_{v_N}(R). \tag{1.12}$$

Hv(R) in Eq. 1.12 is a binary function, h is a bounded activation function, 𝕀 is the indicator function, and v is a shift parameter. Unlike the weight case, the parameters βn and vn are trainable. Without explicit linear regression, the network tunes the βn's and vn's during training and fixes them for testing. They are expected to learn and utilize the statistical features of full-precision activations.
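To make the approximation concrete, the NumPy sketch below builds the M binary bases of Eq. 1.11, fits the α coefficients by least squares, and applies a shifted binarization in the spirit of Eq. 1.12. The uniform spacing of the shift parameters u_i and all function names are illustrative assumptions, not the original ABC-Net implementation.

```python
# Minimal sketch of ABC-Net-style multi-bit weight approximation (Eq. 1.11),
# assuming M shift factors u_i spaced uniformly in [-1, 1].
import numpy as np

def abc_weight_approx(W, M=3):
    """Approximate W by sum_i alpha_i * B_i with B_i = sign(W_bar + u_i * std(W))."""
    W_bar = W - W.mean()                       # mean-centered weights
    std = W.std()
    u = np.linspace(-1.0, 1.0, M)              # fixed shift parameters u_i (assumed spacing)
    B = np.stack([np.sign(W_bar + ui * std) for ui in u])   # (M, ...) binary bases
    B[B == 0] = 1                              # keep entries strictly in {-1, +1}
    # Solve min_alpha || W - sum_i alpha_i B_i ||^2 by linear least squares.
    A = B.reshape(M, -1).T                     # (num_weights, M)
    alpha, *_ = np.linalg.lstsq(A, W.ravel(), rcond=None)
    W_hat = np.tensordot(alpha, B, axes=1)     # reconstructed real-valued filter
    return B, alpha, W_hat

def binarize_activation(R, v=0.0):
    """Shifted binary activation H_v(R) = 2 * 1[R >= v] - 1, as in Eq. 1.12."""
    return 2.0 * (R >= v) - 1.0

# Usage: approximate a random 16x16x3x3 filter bank with M = 3 binary bases.
W = np.random.randn(16, 16, 3, 3)
B, alpha, W_hat = abc_weight_approx(W, M=3)
print(np.abs(W - W_hat).mean())                # approximation error shrinks as M grows
```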

Ternary-Binary Network (TBN) [228] is a CNN with ternary inputs and binary weights. Based on accelerated ternary-binary matrix multiplication, TBN uses efficient operations such as XOR, AND, and bit count in standard CNNs, and thus provides an optimal trade-off between memory, efficiency, and performance. Wang et al. [233] propose a simple yet effective two-step quantization framework (TSQ) by decomposing network quantization into two steps: code learning and transformation function learning based on the learned codes. TSQ fits primarily into the class of 2-bit neural networks.
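The following sketch illustrates, under stated assumptions, how a ternary-binary inner product can be evaluated with XOR, AND, and bit count, in the spirit of TBN's accelerated multiplication. The packed encoding (a non-zero mask plus a sign mask) and the helper names are illustrative, not the exact TBN kernel.

```python
# Sketch of a ternary-binary dot product via bitwise ops (assumed encoding).
import numpy as np

def pack_ternary(x):
    """Encode a ternary vector x in {-1, 0, +1} as (nonzero_mask, sign_mask)."""
    nonzero = sign = 0
    for i, xi in enumerate(x):
        if xi != 0:
            nonzero |= 1 << i
        if xi > 0:
            sign |= 1 << i
    return nonzero, sign

def pack_binary(w):
    """Encode a binary vector w in {-1, +1} as a sign bit mask."""
    sign = 0
    for i, wi in enumerate(w):
        if wi > 0:
            sign |= 1 << i
    return sign

def ternary_binary_dot(x_packed, w_sign):
    """dot(x, w) = (#sign agreements) - (#sign disagreements) over non-zero inputs."""
    nonzero, x_sign = x_packed
    agree = nonzero & ~(x_sign ^ w_sign)       # x != 0 and signs match
    n_active = bin(nonzero).count("1")
    return 2 * bin(agree).count("1") - n_active

# Usage: matches the plain floating-point dot product.
x = np.random.choice([-1, 0, 1], size=32)
w = np.random.choice([-1, 1], size=32)
assert ternary_binary_dot(pack_ternary(x), pack_binary(w)) == int(np.dot(x, w))
```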

Local Binary Convolutional Network (LBCNN) [109] proposes a local binary convolution (LBC) layer motivated by local binary patterns (LBP), an image descriptor rooted in the face recognition community. The LBC layer consists of a set of fixed, sparse, predefined binary convolutional filters that are not updated during training, a non-linear activation function, and a set of learnable linear weights. The linear weights combine the activated filter responses to approximate the corresponding activated filter responses of a standard convolutional layer. The LBC layer often affords significant parameter savings, with 9x to 169x fewer learnable parameters than a standard convolutional layer. Furthermore, the sparse and binary nature of the weights also results in up to 169x savings in model size compared to a conventional convolution.
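The sketch below is a minimal PyTorch rendition of an LBC-style layer as described above: fixed sparse binary filters, a non-linear activation, and learnable 1x1 weights that combine the responses. The sparsity level, the number of anchor filters, and the class name are assumptions made for illustration.

```python
# Minimal LBC-style layer: frozen sparse binary filters + learnable 1x1 combination.
import torch
import torch.nn as nn

class LocalBinaryConv(nn.Module):
    def __init__(self, in_ch, out_ch, n_anchor=32, kernel_size=3, sparsity=0.1):
        super().__init__()
        # Fixed, sparse, predefined binary filters (not updated during training).
        anchor = torch.zeros(n_anchor, in_ch, kernel_size, kernel_size)
        mask = torch.rand_like(anchor) < sparsity
        signs = torch.randint(0, 2, anchor.shape).float() * 2 - 1   # {-1, +1}
        anchor[mask] = signs[mask]
        self.register_buffer("anchor_weight", anchor)               # frozen, no gradient
        self.act = nn.ReLU(inplace=True)
        # Learnable linear (1x1) weights combining the activated responses.
        self.linear = nn.Conv2d(n_anchor, out_ch, kernel_size=1, bias=False)
        self.pad = kernel_size // 2

    def forward(self, x):
        responses = nn.functional.conv2d(x, self.anchor_weight, padding=self.pad)
        return self.linear(self.act(responses))

# Usage: drop-in stand-in for a 3x3 convolution mapping 16 -> 32 channels.
layer = LocalBinaryConv(16, 32)
y = layer(torch.randn(1, 16, 28, 28))
print(y.shape)   # torch.Size([1, 32, 28, 28])
```

Only the 1x1 combination weights are trained, which is where the parameter savings over a standard convolution come from.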

Modulated Convolutional Networks (MCN) [236] first introduce modulation filters (M-Filters) to recover the binarized filters. M-Filters are designed to approximate unbinarized convolutional filters in an end-to-end framework. Each layer shares only one M-Filter, leading to a significant reduction in model size. To reconstruct the unbinarized filters, they introduce a modulation process based on the M-Filters and binarized filters. Figure 1.1 is an example of the modulation process. In this example, the M-Filter has four planes, each of which can be expanded to a 3D matrix according to the channels of the binarized filter. After the operation between the binarized filter and each expanded M-Filter, the reconstructed filter Q is obtained.
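A minimal NumPy sketch of this reconstruction step is given below: each plane of a shared M-Filter is expanded over the channels of a binarized filter and combined with it element-wise to produce Q. Reading the combination as an element-wise product is an assumption made for illustration, as are the function and variable names.

```python
# Sketch of MCN-style modulation: expand each M-Filter plane over the channels
# of a binarized filter and combine element-wise (assumed operation).
import numpy as np

def modulate(binary_filter, m_filter):
    """binary_filter: (C, kH, kW) in {-1, +1}; m_filter: (K, kH, kW) real-valued.

    Returns Q with shape (K, C, kH, kW): one reconstructed filter per M-Filter plane.
    """
    C = binary_filter.shape[0]
    Q = []
    for plane in m_filter:                                   # each of the K planes
        expanded = np.repeat(plane[None, :, :], C, axis=0)   # expand to (C, kH, kW)
        Q.append(binary_filter * expanded)                   # element-wise modulation
    return np.stack(Q)

# Usage: a 3x3 binarized filter with 4 channels and a shared M-Filter with 4 planes.
B = np.sign(np.random.randn(4, 3, 3))
M = np.abs(np.random.randn(4, 3, 3))     # one M-Filter shared by the whole layer
Q = modulate(B, M)
print(Q.shape)                            # (4, 4, 3, 3)
```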

As shown in Fig. 1.2, the reconstructed filters Q are used to calculate the output feature maps F. There are four planes in Fig. 1.2, so the number of channels in the feature maps is also 4. With MCN convolution, every feature map has the same number of input and output channels, allowing the module to be replicated and MCNs to be easily implemented.

Unlike previous work in which the model binarizes each filter independently, Bulat et al. [23] propose parameterizing each layer's weight tensor using a matrix or tensor decomposition. The binarization process uses latent parametrization through a quantization function